Suleiman D., Awajan A., 2018. Comparative study of word embedding models and their usage in Arabic language applications. Proceedings of the 19th International Arab Conference on Information Technology (ACIT'2018). 28-30 November 2018. Beirut, Lebanon.

Abstract

Word embedding is the representation of text as vectors such that words with similar syntax and semantics have similar vector representations. Representing words as vectors is crucial for most natural language processing applications; when a neural network is used for processing, the word vectors are fed as input to the network. In this paper, a comparative study of several word embedding models is conducted, including GloVe and the two word2vec approaches, CBOW and Skip-gram. Furthermore, this study surveys the state of the art in using word embeddings in Arabic language applications such as sentiment analysis, semantic similarity, short answer grading, information retrieval, paraphrase identification, plagiarism detection, and textual entailment.
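
As a minimal illustration of the two word2vec training modes contrasted in the abstract (CBOW and Skip-gram), the sketch below trains both variants with the gensim library; the toy corpus, the hyperparameter values, and the gensim 4.x parameter names (vector_size, sg) are assumptions made for demonstration and are not taken from the paper.

    # Minimal sketch; assumes gensim 4.x and a toy tokenized corpus for illustration only.
    from gensim.models import Word2Vec

    # Tiny tokenized corpus standing in for a real text collection.
    corpus = [
        ["the", "movie", "was", "great"],
        ["the", "film", "was", "excellent"],
        ["the", "plot", "was", "boring"],
    ]

    # sg=0 selects CBOW (predict a word from its surrounding context);
    # sg=1 selects Skip-gram (predict the context from a given word).
    cbow_model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)
    skipgram_model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

    # Each word is now a dense vector; words used in similar contexts
    # should end up with similar vectors.
    print(cbow_model.wv["movie"].shape)                    # (50,)
    print(skipgram_model.wv.most_similar("movie", topn=3))

In a downstream application such as sentiment analysis, these per-word vectors would typically be fed as the input layer of a neural network, as the abstract describes.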